17 research outputs found

    Theory of composable latency-insensitive refinements

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.Cataloged from PDF version of thesis.Includes bibliographical references (p. 36-37).Simulation of a synchronous system on a hardware platform, for example an FPGA, can be performed using a hardware prototype of the system. But the prototype may not meet the resource and timing constraints of that platform. One way to meet the constraints is to partition the prototype hierarchically into modules, and to refine the individual modules while preserving the overall behavior of the system. In this thesis we formalize the notion of a refinement that preserves the behavior of the original modules - we call such refinements latency-insensitive refinements. We show that if these latency-insensitive refinements of the modules obey certain conditions, then these refinements can be composed together hierarchically in order to obtain the latency-insensitive refinement of the original system. We call the latency-insensitive refinements that obey these conditions as composable latency-insensitive refinements. We also give a procedure to automatically transform a module to a latency-insensitive refinement while obeying the conditions that enable it to be composed hierarchically. The transformation serves as a starting point for making further refinements and optimizations, and thus, gives a methodology to design hardware simulators for synchronous systems.by Muralidaran Vijayaraghavan.S.M

    A general technique for deterministic model-cycle-level debugging

    Get PDF
    Efficient use of FPGA resources requires FPGA-based performance models of complex hardware to implement one model cycle, i.e., one time-step of the original synchronous system, in several implementation cycles. Generally implementation cycles have no simple relationship with model cycles, and it is tricky to reconstruct the state of the synchronous system at the model-cycle boundaries if only implementation-cycle-level control and information is provided. A good debugging facility needs to provide: complete control over the functioning of the target design being simulated; fast and easy access to all the significant target design state for both monitoring and modification; and some means of accomplishing deterministic execution when the target design is a multicore processor running a parallel application. Moreover, these features need to be provided in a manner which does not incur substantial resource and performance penalties. In this paper, we present a debugging technique based on the LI-BDN theory. We show how the technique facilitates deterministic model-cycle-level debugging. We used it to build the debugging infrastructure for Arete, which is an FPGA-based cycle-accurate multicore simulator. The resource and performance penalties of our debugging technique are minimal; in Arete the debugging infrastructure has area and performance overheads of 5% and 6%, respectively.IBM Researc

    Modular verification of hardware systems

    No full text
    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.Cataloged from PDF version of thesis.Includes bibliographical references (pages 189-196).As hardware systems are becoming bigger and more complex, it is becoming increasingly harder to design and reason about these systems in a monolithic fashion. While hardware is often designed in a modular manner, its verification is rarely performed modularly. Moreover, any modular refinement to an existing system requires a full system verification to guarantee correctness, even if only a few components of the system have been refined. In this thesis, we present a new framework for modular verification of hardware designs written in the Bluespec language. That is, we formalize the idea of components in a hardware design, with well-defined input and output channels; and we show how to specify and verify components individually. For verifying a full system, we show how the proofs of its components can be composed, treating the components as black-boxes and not looking beyond their specifications. As a demonstration of this methodology, we verify a fairly realistic implementation of a multicore shared-memory system with two types of components: memory system and processor, with machine-checked proofs in the Coq proof assistant. Both components include nontrivial optimizations, with the memory system employing an arbitrary hierarchy of cache nodes that communicate with each other concurrently, and with the processor doing speculative execution of many concurrent read operations. Nonetheless, we prove that the combined system implements sequential consistency. To our knowledge, our memory-system proof is the first machine verification of a cache-coherence protocol parameterized over an arbitrary cache hierarchy, and our full-system proof is the first machine verification of sequential consistency for a multicore hardware design that includes caches and speculative processors. We also embed the Bluespec language in the Coq proof assistant and formalize its modular semantics, enabling a verification engineer to obtain machine-checked proofs for Bluespec designs using our framework.by Muralidaran Vijayaraghavan.Ph. D

    Bounded Dataflow Networks and Latency-Insensitive Circuits

    No full text
    We present a theory for modular refinement of Synchronous Sequential Circuits (SSMs) using Bounded Dataflow Networks (BDNs). We provide a procedure for implementing any SSM into an LI-BDN, a special class of BDNs with some good compositional properties. We show that the Latency-Insensitive property of LI-BDNs is preserved under parallel and iterative composition of LI-BDNs. Our theory permits one to make arbitrary cuts in an SSM and turn each of the parts into LI-BDNs without affecting the overall functionality. We can further refine each constituent LI-BDN into another LI-BDN which may take different number of cycles to compute. If the constituent LI-BDN is refined correctly we guarantee that the overall behavior would be cycle-accurate with respect to the original SSM. Thus one can replace, say a 3-ported register file in an SSM by a one-ported register file without affecting the correctness of the SSM. We give several examples to show how our theory supports a generalization of previous techniques for Latency-Insensitive refinements of SSMs.Intel CorporationNational Science Foundation (U.S.) (grant Generating High-Quality Complex Digital Systems from High-level Specification (No. 0541164)

    Modular Compilation of Guarded Atomic Actions

    No full text
    Abstract-Over the last decade, Bluespec, a hardware description language of guarded atomic actions has been used to describe rapidly modifiable, modular, no-compromise hardware designs and generate circuits from them. While the language itself supports significant modularity, the compiler compiles a module with other modules as parameters by in-lining or flattening the module. This forces the user to either suffer large compile times or to change the modular structure of the design. In this paper we propose a new modular compilation scheme which supports compilation of modules with interface methods as parameters and preserves Bluespec's one-rule-at-a-time semantic model. This compilation process inherently requires the distributed scheduling of rules

    A general technique for deterministic model-cycle-level debugging

    No full text
    Abstract-Efficient use of FPGA resources requires FPGA-based performance models of complex hardware to implement one model cycle, i.e., one time-step of the original synchronous system, in several implementation cycles. Generally implementation cycles have no simple relationship with model cycles, and it is tricky to reconstruct the state of the synchronous system at the modelcycle boundaries if only implementation-cycle-level control and information is provided. A good debugging facility needs to provide: complete control over the functioning of the target design being simulated; fast and easy access to all the significant target design state for both monitoring and modification; and some means of accomplishing deterministic execution when the target design is a multicore processor running a parallel application. Moreover, these features need to be provided in a manner which does not incur substantial resource and performance penalties. In this paper, we present a debugging technique based on the LI-BDN theory. We show how the technique facilitates deterministic model-cycle-level debugging. We used it to build the debugging infrastructure for Arete, which is an FPGA-based cycle-accurate multicore simulator. The resource and performance penalties of our debugging technique are minimal; in Arete the debugging infrastructure has area and performance overheads of 5% and 6%, respectively
    corecore